{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 下载环境"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "pip install transformers\n",
    "pip install modelscope\n",
    "pip install datasets\n",
    "pip install accelerate\n",
    "pip install bitsandbytes \n",
    "pip install peft\n",
    "pip install swanlab\n",
    "pip install sentencepiece\n",
    "pip install trl \n",
    "\n",
    "\n",
    "pip install unsloth"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 下载数据集"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import subprocess\n",
    "import os\n",
    "\n",
    "result = subprocess.run('bash -c \"source /etc/network_turbo && env | grep proxy\"', shell=True, capture_output=True, text=True)\n",
    "output = result.stdout\n",
    "for line in output.splitlines():\n",
    "    if '=' in line:\n",
    "        var, value = line.split('=', 1)\n",
    "        os.environ[var] = value\n",
    "\n",
    "from huggingface_hub import notebook_login\n",
    "\n",
    "notebook_login()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using the latest cached version of the dataset since Ronndy/medical_o1_sft_Chinese couldn't be found on the Hugging Face Hub\n",
      "Found the latest cached dataset configuration 'default' at data/reason/Ronndy___medical_o1_sft_chinese/default/0.0.0/b9f185ad9b04dc86473b9798513089d6be910d4d (last modified on Mon May 12 18:41:33 2025).\n",
      "Using the latest cached version of the dataset since BAAI/IndustryInstruction_Health-Medicine couldn't be found on the Hugging Face Hub\n",
      "Found the latest cached dataset configuration 'default' at data/no_reason/BAAI___industry_instruction_health-medicine/default/0.0.0/010a7df17056ca13601ea15dfb7d753a4aac1efe (last modified on Mon May 12 16:54:20 2025).\n"
     ]
    }
   ],
   "source": [
    "from datasets import load_dataset\n",
    "\n",
    "\n",
    "ds_reason = load_dataset(\"Ronndy/medical_o1_sft_Chinese\",cache_dir='./data/reason')\n",
    "ds_no_reason = load_dataset(\"BAAI/IndustryInstruction_Health-Medicine\",cache_dir = './data/no_reason')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DatasetDict({\n",
       "    train: Dataset({\n",
       "        features: ['messages'],\n",
       "        num_rows: 17340\n",
       "    })\n",
       "    test: Dataset({\n",
       "        features: ['messages'],\n",
       "        num_rows: 3717\n",
       "    })\n",
       "})"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds_reason"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'messages': [{'content': '你是一个医生，能够根据病人的表述为病人诊断病情，当然你也有跟人交流普通问题的情况。',\n",
       "   'role': 'system'},\n",
       "  {'content': '一刚出生新生儿,身体红,四肢青紫,心率90次/分,呼吸20次/分,不规则,四肢略屈曲,弹足底有皱眉。该新生儿Apgar评分为\\nA. 8分\\nB. 7分\\nC. 6分\\nD. 5分\\nE. 4分',\n",
       "   'role': 'user'},\n",
       "  {'content': '<think>\\n嗯，这个新生儿刚出生身体是红色的，但是四肢是青紫色的。看上去身体的血液循环似乎还不错，但是四肢稍微差一点。躯干部位在Apgar评分中可以算2分，四肢青紫得不了分，所以颜色这部分我认为总共可以算1分。\\n\\n哦，他的心率是90次每分钟。根据Apgar评分的标准，心率要超过100才可以得2分。90次不到100次但还不是最糟糕的情况下，可以算1分。\\n\\n然后是呼吸情况，宝宝的呼吸好像有点不规则。对于Apgar评分来说，不规则呼吸大概算是1分吧，而正常呼吸是2分的标准。这样看来，呼吸部分只得1分。\\n\\n肌张力方面，小家伙四肢略微屈曲。完全屈曲是最好的，得2分，但他只是稍微屈曲，所以这里给1分妥当。\\n\\n哦对，反射兴奋性这个也要看一下。他在弹足底时有皱眉的反应。这个皱眉呢，在评分中算是轻度反应，得1分。要是能有更明显的反应比如哭，那才能得2分。\\n\\n综合来看，这个小宝宝在评分上颜色1分，心率1分，呼吸1分，肌张力1分，反射兴奋性1分，总共加起来应该是5分。果然是D，没错。\\n</think>\\n<response>根据Apgar评分标准：\\n\\n1. 颜色：身体红、四肢青紫，计1分。\\n2. 心率：90次/分，计1分。\\n3. 呼吸：不规则，计1分。\\n4. 肌张力：四肢略屈曲，计1分。\\n5. 反射兴奋性：弹足底有皱眉，计1分。\\n\\n综合以上五项指标，该新生儿的Apgar评分为5分。因此，正确选项是D. 5分。</response>',\n",
       "   'role': 'assistant'}]}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds_reason['train'][0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def convert_to_alpaca_format(examples):\n",
    "    conversations = []\n",
    "    for messages in examples[\"messages\"]:\n",
    "        # 提取系统提示词\n",
    "        system_prompt = next((msg[\"content\"] for msg in messages if msg[\"role\"] == \"system\"), \"\")\n",
    "        \n",
    "        # 提取用户问题和助手回答\n",
    "        user_question = next((msg[\"content\"] for msg in messages if msg[\"role\"] == \"user\"), \"\")\n",
    "        assistant_answer = next((msg[\"content\"] for msg in messages if msg[\"role\"] == \"assistant\"), \"\")\n",
    "        \n",
    "        # 分割think和response部分\n",
    "        think_part = \"\"\n",
    "        response_part = \"\"\n",
    "        if assistant_answer:\n",
    "            think_start = assistant_answer.find(\"<think>\")\n",
    "            think_end = assistant_answer.find(\"</think>\")\n",
    "            if think_start != -1 and think_end != -1:\n",
    "                think_part = assistant_answer[think_start+7:think_end].strip()\n",
    "            \n",
    "            response_start = assistant_answer.find(\"<response>\")\n",
    "            response_end = assistant_answer.find(\"</response>\")\n",
    "            if response_start != -1 and response_end != -1:\n",
    "                response_part = assistant_answer[response_start+10:response_end].strip()\n",
    "            else:\n",
    "                # 如果没有response标签，尝试获取think标签之后的内容\n",
    "                response_part = assistant_answer[think_end+8:].strip()\n",
    "        \n",
    "        # 构建Alpaca格式的对话\n",
    "        conversation = [\n",
    "            {\"role\": \"system\", \"content\": system_prompt},\n",
    "            {\"role\": \"user\", \"content\": user_question},\n",
    "            {\"role\": \"assistant\", \"content\": f'<think>{think_part}</think>{response_part}'}\n",
    "        ]\n",
    "        conversations.append(conversation)\n",
    "    \n",
    "    return {\"conversations\": conversations}\n",
    "\n",
    "# 批量处理数据集\n",
    "ds_reason_alpaca = ds_reason['train'].map(\n",
    "    convert_to_alpaca_format,\n",
    "    batched=True,\n",
    "    remove_columns=ds_reason[\"train\"].column_names,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'conversations': [{'content': '你是一个医生，能够根据病人的表述为病人诊断病情，当然你也有跟人交流普通问题的情况。',\n",
       "   'role': 'system'},\n",
       "  {'content': '一刚出生新生儿,身体红,四肢青紫,心率90次/分,呼吸20次/分,不规则,四肢略屈曲,弹足底有皱眉。该新生儿Apgar评分为\\nA. 8分\\nB. 7分\\nC. 6分\\nD. 5分\\nE. 4分',\n",
       "   'role': 'user'},\n",
       "  {'content': '<think>嗯，这个新生儿刚出生身体是红色的，但是四肢是青紫色的。看上去身体的血液循环似乎还不错，但是四肢稍微差一点。躯干部位在Apgar评分中可以算2分，四肢青紫得不了分，所以颜色这部分我认为总共可以算1分。\\n\\n哦，他的心率是90次每分钟。根据Apgar评分的标准，心率要超过100才可以得2分。90次不到100次但还不是最糟糕的情况下，可以算1分。\\n\\n然后是呼吸情况，宝宝的呼吸好像有点不规则。对于Apgar评分来说，不规则呼吸大概算是1分吧，而正常呼吸是2分的标准。这样看来，呼吸部分只得1分。\\n\\n肌张力方面，小家伙四肢略微屈曲。完全屈曲是最好的，得2分，但他只是稍微屈曲，所以这里给1分妥当。\\n\\n哦对，反射兴奋性这个也要看一下。他在弹足底时有皱眉的反应。这个皱眉呢，在评分中算是轻度反应，得1分。要是能有更明显的反应比如哭，那才能得2分。\\n\\n综合来看，这个小宝宝在评分上颜色1分，心率1分，呼吸1分，肌张力1分，反射兴奋性1分，总共加起来应该是5分。果然是D，没错。</think>根据Apgar评分标准：\\n\\n1. 颜色：身体红、四肢青紫，计1分。\\n2. 心率：90次/分，计1分。\\n3. 呼吸：不规则，计1分。\\n4. 肌张力：四肢略屈曲，计1分。\\n5. 反射兴奋性：弹足底有皱眉，计1分。\\n\\n综合以上五项指标，该新生儿的Apgar评分为5分。因此，正确选项是D. 5分。',\n",
       "   'role': 'assistant'}]}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ds_reason_alpaca[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 注意事项"
   ]
  },
  {
   "cell_type": "raw",
   "metadata": {},
   "source": [
    "运行代码前先开启学术加速/梯子\n",
    "\n",
    "只能单卡启动，因为是free模式\n",
    "\n",
    "强行用device_map多卡启动也没用，训练的时候会报错\n",
    "\n",
    "sfttrainer中的 processing_class=tokenizer 改成 tokenizer=tokenizer 因为这里是unsloth包装过的sfttrainer\n",
    "\n",
    "用 python train.py运行，不要用deepspeed运行 （不要用deepspeed --include 'localhost:0' train.py 启动）"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
